Skip to content

Conversation

@wbruna
Copy link

@wbruna wbruna commented Jun 13, 2025

This change makes the VAE threshold limit configurable, and trigger by area instead of max image size.

Motivation:

As I mentioned on the commit messages, I'm keeping the old behavior by default just in case, but triggering by image area should be safe since the VAE buffer size is consistently proportional to the image area (exactly 6656 bytes per image pixel in my tests).

I don't know what would be the best way to deal with the config option, though. Currently, we have:

  • notile=False : default, tile above 768 on either side
  • notile=True : never use VAE tiling

On the backend, I replaced it with a tiled_vae_threshold, meaning:

  • tiled_vae_threshold<0: never use VAE tiling
  • tiled_vae_threshold==0: default, tile above 768 on either side
  • tiled_vae_threshold>0: tile above tiled_vae_threshold²

because this maps directly to the current notile config: notile==True <=> tiled_vae_threshold==-1, and notile==False <=> tiled_vae_threshold==0. But using negatives for the default value would be simpler:

  • tiled_vae_threshold<0: default, tile above 768 on either side
  • tiled_vae_threshold>=0: tile above tiled_vae_threshold²

Also, I'm not sure about how you usually replace a command-line flag or config option. Is the old one kept around for compatibility? Is it replaced on newer .kcpps files, or kept as-is and just ignored if the newer option is defined?

@wbruna
Copy link
Author

wbruna commented Jun 13, 2025

(and please don't worsen your weekend on my account 🙂)

@LostRuins
Copy link
Owner

Just wondering, why would you need this to be configurable? Is VAE tiling resulting in poor performance for you? Because tiling can be easily toggled off.

@wbruna
Copy link
Author

wbruna commented Jun 14, 2025

Well, I tried to explain in the 'motivation' above: it's mainly useful for lowering the threshold, to reduce peak memory usage. Forcing it off is, as you say, easy enough; but there's currently no way to force it on, and changing the threshold value is simpler than adding another flag.

@LostRuins
Copy link
Owner

LostRuins commented Jun 14, 2025

Ah i get it. I should change --sdnotile to --sdtilevae [on|off|auto] maybe?
but actually i've been thinking of lowering the threshold to anything over 512
cause people can simply disable it if they dont like it.
what do you think?

@wbruna
Copy link
Author

wbruna commented Jun 14, 2025

Yeah, I think that would be enough; although controlling it directly with a turn-on-VAE-tiling-above-this-resolution integer could be simpler and more useful (even if it's by image sides instead of area). Something like --sdtilevae 550 would trigger for images above 512, --sdtilevae 0 would force it on, --sdtilevae 8192 force it off, etc.

@LostRuins LostRuins force-pushed the concedo_experimental branch from fe14845 to 238be98 Compare June 14, 2025 04:00
@wbruna
Copy link
Author

wbruna commented Jun 14, 2025

Related: #1603

By the way, have you considered applying your Koboldcpp changes to sd.cpp on a sd.cpp public fork first? It could help tracking down which bugs came from upstream, and which are specific to Koboldcpp; at least until the whole sd.cpp maintenance issue gets sorted out.

@LostRuins
Copy link
Owner

I don't have many changes to upstream actually, most of it is getting it to work from a server context since sd.cpp is primarily local file i/o based.

@wbruna wbruna force-pushed the koboldcpp_vae_threshold branch from 4ac7c4c to df69607 Compare June 20, 2025 13:16
wbruna added 3 commits June 20, 2025 10:40
I've tested with GGML_VULKAN_MEMORY_DEBUG all resolutions with
the same 768x768 area (even extremes like 64x9216), and many
below that: all consistently allocate 6656 bytes per image pixel.
As tiling is primarily useful to avoid excessive memory usage, it
seems reasonable to enable VAE tiling based on area rather than
maximum image side.

However, as there is currently no user interface option to change
it back to a lower value, it's best to maintain the default
behavior for now.
This allows selecting a lower threshold value, reducing the
peak memory usage.

The legacy sdnotile parameter gets automatically converted to
the new parameter, if it's the only one supplied.
@wbruna wbruna force-pushed the koboldcpp_vae_threshold branch from df69607 to 18b24ae Compare June 20, 2025 13:44
@wbruna
Copy link
Author

wbruna commented Jun 20, 2025

I gave the interface a try, directly by image size: 0 keeps it at the default 768, -1 disables tiling, other values trigger VAE tiling above those sizes.

Seems to be working well; I was able to easily get the VAE step under 2G, or even 1G VRAM. The new parameter doesn't look much more complicated than the older sdnotile: the user can still quickly disable tiling with -1, or leave it at the default with 0.

And I'm not sure if that's a concern, but if the user only provides the --sdnotile parameter, I try to convert it silently to the new one, so things like user launching scripts should keep working without changes.

@wbruna wbruna marked this pull request as ready for review June 20, 2025 14:10
@LostRuins LostRuins added the enhancement New feature or request label Jun 20, 2025
@LostRuins
Copy link
Owner

Alright looks good, I simplified it a bit since there's no need to keep the legacy logic. We'll just use your area calculations as the threshold.

Also instead of having too many magic numbers I made it a bit more intuitive. The GUI launcher shows the tiling threshold (defaults to 768). If set to 0, tiling is off, otherwise it's on and at the configured value. That's it.

(It has the side effect that setting it to "1" also guarantees tiling since nothing will be smaller than 1x1)

@LostRuins LostRuins merged commit 08adfb5 into LostRuins:concedo_experimental Jun 21, 2025
@wbruna
Copy link
Author

wbruna commented Jun 21, 2025

Working fine on 1.94! Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants